Zhaoran Wang

I am an associate professor in the Departments of Industrial Engineering & Management Sciences and Computer Science at Northwestern University. I am also with the Centers for Deep Learning and Optimization & Statistical Learning.

The long-term goal of my research is to develop a new generation of data-driven decision-making methods, theory, and systems, which tailor artificial intelligence towards addressing societal challenges. To this end, my research aims at:
  • making autonomous learning agents more efficient, both computationally and statistically, in a principled manner to enable their critical applications;
  • designing and optimizing societal-scale multi-agent systems, especially those involving cooperation and/or competition among humans and/or robots.
With this aim in mind, my research interests span machine learning, optimization, statistics, game theory, and information theory.

Selected Recent Papers [Overview] [Conference] [Journal] [Citation]

Reason for Future, Act for Now: A Principled Architecture for Autonomous LLM Agents
Zhihan Liu, Hao Hu, Shenao Zhang, Hongyi Guo, Shuqi Ke, Boyi Liu, Zhaoran Wang
International Conference on Machine Learning (ICML), 2024
[Arxiv] [Demo] [GitHub]
Provably Mitigating Overoptimization in RLHF: Your SFT Loss is Implicitly an Adversarial Regularizer
Zhihan Liu, Miao Lu, Shenao Zhang, Boyi Liu, Hongyi Guo, Yingxiang Yang, Jose Blanchet, Zhaoran Wang
Advances in Neural Information Processing Systems (NeurIPS), 2024
[Arxiv]
Maximize to Explore: A Single Objective Fusing Estimation, Planning, and Exploration
Zhihan Liu, Miao Lu, Wei Xiong, Han Zhong, Hao Hu, Shenao Zhang, Sirui Zheng, Zhuoran Yang, Zhaoran Wang
Advances in Neural Information Processing Systems (NeurIPS), 2023 (spotlight)
[Arxiv]
Embed to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency
Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang
International Conference on Learning Representations (ICLR), 2023
[Arxiv]
Reinforcement Learning from Partial Observation: Linear Function Approximation with Provable Sample Efficiency
Qi Cai, Zhuoran Yang, Zhaoran Wang
International Conference on Machine Learning (ICML), 2022
[Arxiv]
A Two-Timescale Framework for Bilevel Optimization: Complexity Analysis and Application to Actor-Critic
Mingyi Hong, Hoi-To Wai, Zhaoran Wang, Zhuoran Yang (alphabetical)
SIAM Journal on Optimization (SIOPT), 2022
[Arxiv]
Is Pessimism Provably Efficient for Offline RL?
Ying Jin, Zhuoran Yang, Zhaoran Wang
International Conference on Machine Learning (ICML), 2021
Mathematics of Operations Research (MOR), 2024
[Arxiv]
Provably Efficient Causal Reinforcement Learning with Confounded Observational Data
Lingxiao Wang, Zhuoran Yang, Zhaoran Wang
Advances in Neural Information Processing Systems (NeurIPS), 2021
[Arxiv]
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang, Qi Cai, Zhuoran Yang, Yongxin Chen, Zhaoran Wang
Advances in Neural Information Processing Systems (NeurIPS), 2020 (oral)
[Arxiv]
Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret
Yingjie Fei, Zhuoran Yang, Yudong Chen, Zhaoran Wang, Qiaomin Xie
Advances in Neural Information Processing Systems (NeurIPS), 2020 (spotlight)
[Arxiv]
Pontryagin Differentiable Programming: An End-to-End Learning and Control Framework
Wanxin Jin, Zhaoran Wang, Zhuoran Yang, Shaoshuai Mou
Advances in Neural Information Processing Systems (NeurIPS), 2020
[Arxiv] [Demo]
Provably Efficient Exploration in Policy Optimization
Qi Cai, Zhuoran Yang, Chi Jin, Zhaoran Wang
International Conference on Machine Learning (ICML), 2020
[Arxiv]
Learning Zero-Sum Simultaneous-Move Markov Games Using Function Approximation and Correlated Equilibrium
Qiaomin Xie, Yudong Chen, Zhaoran Wang, Zhuoran Yang
Annual Conference on Learning Theory (COLT), 2020
Mathematics of Operations Research (MOR), 2023
[Arxiv]
Provably Efficient Reinforcement Learning with Linear Function Approximation
Chi Jin, Zhuoran Yang, Zhaoran Wang, Michael Jordan
Annual Conference on Learning Theory (COLT), 2020
Mathematics of Operations Research (MOR), 2023
[Arxiv]
Neural Policy Gradient Methods: Global Optimality and Rates of Convergence
Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang
International Conference on Learning Representations (ICLR), 2020
[Arxiv]
Neural Proximal/Trust Region Policy Optimization Attains Globally Optimal Policy
Boyi Liu, Qi Cai, Zhuoran Yang, Zhaoran Wang
Advances in Neural Information Processing Systems (NeurIPS), 2019
[Arxiv]
Neural Temporal-Difference and Q-Learning Provably Converge to Global Optima
Qi Cai, Zhuoran Yang, Jason Lee, Zhaoran Wang
Advances in Neural Information Processing Systems (NeurIPS), 2019
Mathematics of Operations Research (MOR), 2024
[Arxiv]
A Theoretical Analysis of Deep Q-Learning
Jianqing Fan, Zhaoran Wang, Yuchen Xie, Zhuoran Yang (alphabetical)
Submitted, 2020
[Arxiv]
Acknowledgement: National Science Foundation (Awards 2235451, 2225087, 2211210, CAREER-2048075, 2015568, 2008827, 1934931/2216970), Simons Institute (Theory of Reinforcement Learning), Amazon, J.P. Morgan, Two Sigma, Tencent